International Journal of Advanced Engineering Research and Science (IJAERS) 
https://dx.doi.org/10.22161/ijaers/3.11.22


[Vol-3, Issue-11, Nov- 2016] 
ISSN: 2349-6495(P) / 2456-1908(O)


A Comparative Study of Text Summarization 
Based on Synchronous and Asynchronous PSO 

R. Pallavi Reddy 1 , Kalyani Nara 2 , S. Sravani Reddy 3 

1 Assistant Professor, Dept of CSE, GNITS, Hyderabad, Telangana, India.

2 Professor, Dept of CSE, GNITS, Hyderabad, Telangana, India. 

3 PG Scholar, Dept of CSE, GNITS, Hyderabad, Telangana, India. 


Abstract — Text summarization is the process of 
extracting the most important sentences from the original
document without changing its meaning. This paper focuses
on the extractive summarization technique, which chooses the
important sentences from the document and integrates them
into a summary. Particle Swarm Optimization (PSO)
optimizes a problem by iteratively trying to improve
candidate solutions with regard to the input data. It
solves a problem by having a population of candidate
solutions move around the search space according to
arithmetic formulae over each particle's position and
velocity. According to the order in which the particles are
updated, PSO can be categorized into Synchronous PSO
(S-PSO) and Asynchronous PSO (A-PSO). In S-PSO, the
velocities and positions of the particles are updated only
after the performance of the whole swarm has been
calculated, which favors exploitation. In A-PSO, the
velocity and position of a particle are updated as soon as
its own performance has been calculated, using partial
information, which leads to more exploration. A comparative
study of S-PSO and A-PSO is carried out using the precision
and recall values obtained for different datasets. A-PSO
has higher precision and recall values than S-PSO.

Keyword — Text Summarization, particle swarm 

optimization, Synchronous PSO (S-PSO), Asynchronous 
PSO (A-PSO). 

I. INTRODUCTION 

Text summarization is the process of distilling the most
important information from a source document to
produce an abridged version of the text. Automatic text
summarization condenses the input text into a summary.
The main advantage of a summary is that it reduces
reading time. Text summarization techniques can be
classified into extractive and abstractive summarization.
An extractive summarization method selects important
sentences, paragraphs, etc. from the original document and
concatenates them into a shorter form. An abstractive
summarization method builds an understanding of the main
concepts in a document and then expresses those concepts
in clear natural language.

Extraction methods generally use a sentence extraction
technique to create the summary. Kennedy and Eberhart
introduced Particle Swarm Optimization (PSO) in 1995
[1]. PSO is a swarm-based stochastic optimization algorithm
that simulates the social behavior of organisms
such as birds and fish. These organisms benefit in their
search for food sources by cooperating with their
neighbors. In PSO, the individual agents that make up a
swarm are called particles. The particles move within the
search space to find the optimum solution by updating
their velocity and position. These values are affected by
the experience of the particles. PSO has drawn a lot of
attention from researchers all over the world and has
undergone many developments. Many variations of PSO
have been proposed to improve the performance of the
algorithm. The order in which the particles are updated
affects the efficiency of PSO. In the original PSO, the best
solution found is chosen as gBest from the particles'
information only after the performance of the whole swarm
has been evaluated. This form of the PSO algorithm is
known as synchronous PSO (S-PSO). This update method
leads to exploitation of the information.

In asynchronous PSO (A-PSO), the position and velocity
are updated as soon as a particle's performance is
evaluated. Therefore, a particle's search is directed by
partial or imperfect information from its neighbors. This
method leads to diversity in the swarm [3]. Particles
updated at the beginning of an iteration rely on
information from the previous iteration, while particles
updated at the end of the iteration rely on information
from the current iteration [4]. A-PSO has been claimed to
perform better than S-PSO. Xue et al. [8] reported that
asynchronous update leads to a shorter execution time.
Because the asynchronous method works on incomplete
information, the current best solution found is
communicated to the particles more slowly, which leads to
more exploration.

A comparative study is performed on the two algorithms
to determine which one produces a better summary.
The rest of the paper is organized as follows. The
text summarization technique has preprocessing and
feature extraction as its initial stages; these steps are
briefly explained in Section II. The synchronous PSO
(S-PSO) algorithm is explained in detail in Section III.
The asynchronous PSO (A-PSO) algorithm is described in
Section IV. Input documents from different domains are
given as input data, and the results obtained from the
algorithms are used to calculate precision and recall
values. The analysis of the results is given in Section V.
The conclusions, based on the experimental evaluation in
Section V, are stated in Section VI.


II. PREPROCESSING AND FEATURE 
EXTRACTION 

A. Preprocessing: 

Preprocessing is important as it provides the summarization
system with a clean and adequate representation of the
source document. Preprocessing helps in identifying
the most important information in a document. A text
file is taken as the input document and passed to
preprocessing. Preprocessing consists of four main steps:
segmentation, stop word removal, tokenization, and
stemming.

Sentence segmentation is the process of dividing the input
file into sentences. Stop words such as "I", "a" and
"the" are removed from the segmented sentences. After
stop word removal, each sentence is divided into tokens, and
base words are obtained by removing prefixes and
suffixes.
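As an illustration only, the four preprocessing steps can be sketched in Python as follows. The stop-word list, the suffix list and all function names are assumptions made for this sketch, not the resources used in the study. Tokenization is applied before stop-word removal here because the filter operates on individual words.

```python
import re

# Tiny illustrative stop-word list (assumption; the paper does not give its list).
STOP_WORDS = {"i", "a", "an", "the", "is", "are", "of", "and", "to", "in"}

# Crude suffix list standing in for a real stemmer (assumption).
SUFFIXES = ("ing", "ed", "es", "s")


def segment(text):
    """Split raw text into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lower-case the sentence and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Return one list of stemmed, stop-word-free tokens per sentence."""
    result = []
    for sentence in segment(text):
        tokens = [t for t in tokenize(sentence) if t not in STOP_WORDS]
        result.append([stem(t) for t in tokens])
    return result
```

Calling `preprocess(text)` on a raw document yields one token list per sentence, which is the form assumed by the feature sketches in this section.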

B. Feature Extraction 

After preprocessing, the document is subjected to feature
extraction, in which the properties of the sentences are
extracted to score each sentence. Eight features are
considered. The value of each feature lies between 0 and 1.
The eight features are:

Title Feature: 

The sentences that contain title words are important as
they are more relevant to the theme. These sentences have a
higher chance of being included in the summary. The
title feature ($T_F$) is calculated as below:

\[ T_F = \frac{\text{no. of title words in sentence}}{\text{no. of title words in title}} \qquad (1) \]

Sentence Length: 

Sentence length ($S_L$) is important in creating the
summary. Short sentences such as names, date lines, etc.
are not added to the summary. This feature is used to
filter out the short sentences.

\[ S_L = \frac{\text{no. of words in sentence}}{\text{no. of words in longest sentence}} \qquad (2) \]
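Equations (1) and (2) are simple ratios and can be sketched as below. The function names are hypothetical, sentences are assumed to be token lists, and counting each distinct title word once is an assumption of this sketch.

```python
def title_feature(sentence_tokens, title_tokens):
    """Eq. (1): title words found in the sentence / words in the title."""
    title = set(title_tokens)  # assumption: each distinct title word counts once
    if not title:
        return 0.0
    return len(title & set(sentence_tokens)) / len(title)


def sentence_length_feature(sentences):
    """Eq. (2): each sentence's word count / the longest sentence's word count."""
    longest = max((len(s) for s in sentences), default=0)
    if longest == 0:
        return [0.0 for _ in sentences]
    return [len(s) / longest for s in sentences]
```

Both functions return values in the range 0 to 1, as required for every feature.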

Term Weight: 

The occurrences of terms within a document have often been
used for calculating the weight of each sentence. The
sentence score can be calculated as the sum of the scores
of the words in the sentence. Each word is weighted by its
term frequency. The term weight is given by:

\[ W_i = tf_i \times isf_i = tf_i \times \log\frac{N}{n_i} \qquad (3) \]

$tf_i$: term frequency of word i
$N$: number of sentences in the document
$n_i$: number of sentences in which word i occurs

The total term weight ($T_W$) is given by the formula

\[ T_W = \frac{\sum_{i=1}^{k} W_i(S)}{\max_{j}\left(\sum_{i=1}^{k} W_i(S_j)\right)} \qquad (4) \]

$k$: number of words in the sentence
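A minimal sketch of equations (3) and (4) follows. It assumes that the term frequency is counted over the whole document and that the logarithm is natural; neither choice is stated in the text, and the function name is hypothetical.

```python
import math
from collections import Counter


def term_weight_feature(sentences):
    """Return T_W per sentence: summed tf*isf weights scaled by the maximum sum."""
    n_sentences = len(sentences)
    # tf_i: occurrences of word i in the document (assumption: document-level count).
    tf = Counter(word for s in sentences for word in s)
    # n_i: number of sentences that contain word i.
    sf = Counter(word for s in sentences for word in set(s))
    # Eq. (3): W_i = tf_i * log(N / n_i), natural logarithm assumed.
    weight = {w: tf[w] * math.log(n_sentences / sf[w]) for w in tf}
    # Eq. (4): sum the weights per sentence, then divide by the largest sum.
    sums = [sum(weight[w] for w in s) for s in sentences]
    top = max(sums, default=0.0)
    return [x / top if top > 0 else 0.0 for x in sums]
```

A word that occurs in every sentence receives weight zero under equation (3), so it does not contribute to any sentence score.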

Sentence Position: 

The sentence position ($S_P$) also plays an important role in
determining whether a sentence is relevant. If
there are 5 sentences in the document, the sentence
positions are given by

$S_P$ = 5/5 for the 1st, 4/5 for the 2nd, 3/5 for the 3rd,
2/5 for the 4th, 1/5 for the 5th (5)
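The five-sentence example generalizes to a document of N sentences as (N - i + 1)/N for the i-th sentence. The short sketch below assumes this generalization; the function name is hypothetical.

```python
def sentence_position_feature(n_sentences):
    """Eq. (5) generalized: the i-th of N sentences (1-based) scores (N - i + 1) / N."""
    return [(n_sentences - i) / n_sentences for i in range(n_sentences)]
```

For a five-sentence document this reproduces the values 5/5, 4/5, 3/5, 2/5 and 1/5 listed above.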

Sentence to Sentence Similarity: 

Similarity between sentences is very important in
generating the summary: similar sentences should
not be repeated in the summary that is generated.

\[ SS_{sim} = \frac{\sum_{j} sim(s_i, s_j)}{\max_{i}\left(\sum_{j} sim(s_i, s_j)\right)} \qquad (6) \]

$s_i$: sentence i
$s_j$: sentence j

$sim(s_i, s_j)$: the similarity of terms 1 to n in
sentences $s_i$ and $s_j$
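The text does not specify the similarity measure sim. The sketch below assumes cosine similarity over bag-of-words counts, which is one common choice and not necessarily the one used in the study; the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two token lists treated as bags of words (assumed measure)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def similarity_feature(sentences):
    """Eq. (6): summed similarity to all other sentences, scaled by the maximum sum."""
    totals = [
        sum(cosine(s, other) for j, other in enumerate(sentences) if j != i)
        for i, s in enumerate(sentences)
    ]
    top = max(totals, default=0.0)
    return [t / top if top > 0 else 0.0 for t in totals]
```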

Proper Noun: 

Sentences that contain more proper nouns are more likely
to be included in the summary. The proper noun ($N_P$)
feature is calculated as below:

\[ N_P = \frac{\text{no. of proper nouns in sentence}}{\text{length of sentence}} \qquad (7) \]

Thematic word: 

The terms that occur more frequently are more related to
the topic. We consider the top 10 most frequent words as
thematic words. The thematic word feature ($W_T$) is
calculated as below:

\[ W_T = \frac{\text{no. of thematic words in sentence}}{\max(\text{no. of thematic words})} \qquad (8) \]

Numerical Data: 

This feature is used to identify the statistical data in every
sentence. Numerical data ($D_N$) is calculated as follows:

\[ D_N = \frac{\text{no. of numerical data in sentence}}{\text{length of sentence}} \qquad (9) \]
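Equations (8) and (9) can be sketched as follows. The function names are hypothetical, and treating a token as numerical when it consists of digits with at most one decimal point is an assumption of this sketch. The proper noun feature of equation (7) is omitted because it needs a part-of-speech tagger, which the text does not name.

```python
from collections import Counter


def thematic_feature(sentences, top_n=10):
    """Eq. (8): thematic words per sentence, scaled by the maximum count."""
    freq = Counter(word for s in sentences for word in s)
    thematic = {w for w, _ in freq.most_common(top_n)}  # top_n most frequent words
    counts = [sum(1 for w in s if w in thematic) for s in sentences]
    top = max(counts, default=0)
    return [c / top if top else 0.0 for c in counts]


def numerical_feature(sentences):
    """Eq. (9): numeric tokens in the sentence / sentence length."""
    return [
        sum(1 for w in s if w.replace(".", "", 1).isdigit()) / len(s) if s else 0.0
        for s in sentences
    ]
```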


III. PARTICLE SWARM OPTIMIZATION 

Particle swarm optimization optimizes a problem by
iteratively trying to improve candidate solutions with
regard to the given input data. A conventional approach
called the synchronous method is a more precise natural
model which increases the possibility of parallelization of
an algorithm [7], [8].
In PSO, the search for the optimum solution is directed by
a swarm of P particles. At time t, the ith particle has a
position, $x_i(t)$, and a velocity, $v_i(t)$. A solution is
represented by the particle's position and velocity. Velocity
represents the rate of change from the current particle
position to the next particle position. The position and
velocity values are initialized with random numbers at the
beginning. In subsequent iterations, the search process is
directed by updating the position and velocity using the
following equations:

\[ v_i(t) = v_i(t-1) + c_1 r_1 \left(pBest_i - x_i(t-1)\right) + c_2 r_2 \left(gBest - x_i(t-1)\right) \qquad (1) \]

\[ x_i(t) = v_i(t) + x_i(t-1) \qquad (2) \]

To prevent the particles from venturing too far from the
feasible region, the value of $v_i(t)$ is clamped to
$\pm V_{max}$. If the value of $V_{max}$ is too large, then
the exploration range is too wide. Conversely, if the value
of $V_{max}$ is too small, then the particles will favor
local search [10]. In (1), $c_1$ and $c_2$ are the learning
factors that control the effect of the cognitive and social
influence on a particle. Typically, both $c_1$ and $c_2$ are
set to 2. Two independent random numbers, $r_1$ and $r_2$,
ranging from 0.0 to 1.0, are incorporated into the
velocity equation. These random terms give the particles
stochastic behavior, thus encouraging
them to explore a wider area.
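The update of a single particle with velocity clamping can be sketched as below. Drawing fresh random numbers for every dimension and the default value of `v_max` are assumptions of this sketch, and the function name is hypothetical.

```python
import random


def update_particle(x, v, p_best, g_best, c1=2.0, c2=2.0, v_max=1.0, rng=random):
    """Apply the velocity and position updates to one particle; return (new_x, new_v)."""
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()  # assumption: drawn per dimension
        vel = vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        vel = max(-v_max, min(v_max, vel))  # clamp the velocity to [-v_max, +v_max]
        new_v.append(vel)
        new_x.append(xi + vel)
    return new_x, new_v
```

When a particle already sits on both its own best and the swarm best, the two attraction terms vanish and the particle simply keeps its clamped previous velocity.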

An individual's progress in PSO is influenced not only by
the particle's own effort and experience but also by the
information shared with its neighbors. The particle's own
experience is represented in equation (1) by $pBest_i$, the
best position found so far by the ith particle. The
neighbors' influence is represented by $gBest$, the best
position found by the swarm up to the current iteration. The
particle's position, $x_i(t)$, is updated using equation (2),
in which a particle's next search starts from its previous
position and the new search is influenced by the past
search [4]. Typically, $x_i(t)$ is limited to prevent the
particles from searching in an infeasible region [5]. The
quality of $x_i(t)$ is assessed by a problem-dependent
fitness function. Each particle is evaluated to determine its
current fitness. If the fitness of a new position is better
than that of $gBest$, $pBest_i$, or both, the new position
is saved as $gBest$ or $pBest_i$ accordingly;
otherwise the old best values remain the same. This
process continues until a stopping criterion is met, that
is, when the maximum iteration limit, T, is reached or the
target solution is attained. Therefore, the maximum
number of fitness evaluations for a swarm of P particles
in a run is (P × T).
Fig. 1: Synchronous PSO


The flowchart in Figure 1 represents the original PSO
algorithm. As shown in the flowchart, the values of
$pBest_i$ and $gBest$ are updated after the fitness of
all the particles has been evaluated. Therefore, this
version of PSO is known as synchronous PSO (S-PSO).
Because $pBest_i$ and $gBest$ are updated after the fitness
of all the particles has been evaluated, S-PSO ensures that
all the particles receive accurate and complete information
about their neighbors. This leads to a better choice of
$gBest$ and allows the particles to exploit this information
so that a better solution can be found. The summary is
generated from the pBest values arranged in
descending order: the sentences are extracted from the
source document and concatenated. However, this
may lead the particles in S-PSO to converge faster,
resulting in premature convergence.
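The synchronous scheme can be sketched as a loop with three separate phases per iteration. The sphere function stands in for the problem-dependent fitness (lower is better here), and the search bounds, swarm size and `v_max` are illustrative assumptions; none of these values come from the study.

```python
import random


def sphere(x):
    """Stand-in fitness (lower is better); the real fitness is problem-dependent."""
    return sum(xi * xi for xi in x)


def s_pso(fitness, dim=2, particles=10, iterations=50, c1=2.0, c2=2.0, v_max=0.5, seed=0):
    """Synchronous PSO sketch: gBest changes only after the whole swarm is evaluated."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(particles)]
    vs = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(particles)]
    p_best = [x[:] for x in xs]
    p_fit = [float("inf")] * particles
    g_best, g_fit = xs[0][:], float("inf")
    for _ in range(iterations):
        # Phase 1: evaluate every particle and refresh its pBest.
        for i, x in enumerate(xs):
            f = fitness(x)
            if f < p_fit[i]:
                p_best[i], p_fit[i] = x[:], f
        # Phase 2: only now refresh gBest from the complete set of pBest values.
        best = min(range(particles), key=p_fit.__getitem__)
        if p_fit[best] < g_fit:
            g_best, g_fit = p_best[best][:], p_fit[best]
        # Phase 3: move all particles using the same gBest.
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (vs[i][d] + c1 * r1 * (p_best[i][d] - xs[i][d])
                     + c2 * r2 * (g_best[d] - xs[i][d]))
                vs[i][d] = max(-v_max, min(v_max, v))
                xs[i][d] += vs[i][d]
    return g_best, g_fit
```

The fitness function is called once per particle per iteration, which gives the P × T evaluations stated above.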

IV. ASYNCHRONOUS PSO (A-PSO) 

In S-PSO, a particle has to wait for the complete swarm to
be evaluated before it can move to a new position and
continue its search. Thus, the particle evaluated first is
idle for the longest time, waiting for the entire
swarm to be updated. A-PSO is an alternative
to S-PSO in which the particles are updated
based on the present state of the swarm. In A-PSO, a
particle's position, velocity, $pBest_i$ and $gBest$ are
updated as soon as its fitness is evaluated. The
particle chooses $gBest$ using a combination of
information from the present and the previous iteration. In
A-PSO, particles in the same iteration may use different
values of $gBest$, as it is chosen based on the information
available during a particle's update.



Fig. 2: Asynchronous PSO 

The flowchart in Figure 2 represents the A-PSO algorithm.
The flow of A-PSO differs from that of S-PSO; however, the
fitness function is still evaluated P times per iteration,
once for each particle. Therefore, the maximum number of
fitness evaluations is (P × T), the same as in S-PSO.
The velocity and position are computed using the same
equations as in S-PSO.

Apart from the type of information used, the lack of
synchronicity in A-PSO resolves the issue of idle
particles faced in S-PSO. An asynchronous update also
allows the update sequence of the particles to change
dynamically, or a particle to be updated more than
once.
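The asynchronous scheme differs from the synchronous one only in where the updates happen: each particle refreshes pBest and gBest and moves immediately after its own evaluation. The sketch below makes the same illustrative assumptions as before (sphere stand-in fitness with lower values being better, arbitrary bounds and parameters, hypothetical names).

```python
import random


def sphere(x):
    """Stand-in fitness (lower is better); the real fitness is problem-dependent."""
    return sum(xi * xi for xi in x)


def a_pso(fitness, dim=2, particles=10, iterations=50, c1=2.0, c2=2.0, v_max=0.5, seed=0):
    """Asynchronous PSO sketch: each particle updates right after its own evaluation."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(particles)]
    vs = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(particles)]
    p_best = [x[:] for x in xs]
    p_fit = [float("inf")] * particles
    g_best, g_fit = xs[0][:], float("inf")
    for _ in range(iterations):
        for i in range(particles):
            f = fitness(xs[i])
            if f < p_fit[i]:
                p_best[i], p_fit[i] = xs[i][:], f
            # gBest may change in the middle of an iteration, so later particles
            # see newer information than earlier ones.
            if f < g_fit:
                g_best, g_fit = xs[i][:], f
            # Move particle i immediately, without waiting for the rest of the swarm.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (vs[i][d] + c1 * r1 * (p_best[i][d] - xs[i][d])
                     + c2 * r2 * (g_best[d] - xs[i][d]))
                vs[i][d] = max(-v_max, min(v_max, v))
                xs[i][d] += vs[i][d]
    return g_best, g_fit
```

As in S-PSO, the fitness function is called P × T times in a run.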

V. ANALYSIS OF RESULTS 

The analysis of the results considers documents from the
domains Economy, Secularism, Earth, Nature, Forest,
and Metadata. For every document, the generated summary
is compared with a manually generated reference summary
to obtain the precision and recall values. Summaries are
generated using synchronous PSO and asynchronous PSO.
Precision, recall and F-measure values are calculated for
each document, as shown in Table 1. The graphs plot the
precision, recall and F-measure values to show the
performance of the systems; one graph is drawn for
each dataset, as shown below. Recall is also known
as sensitivity. Recall increases gradually, as shown in
the figures. The increase in recall suggests that the system
performs better than other systems. Comparing
synchronous PSO and asynchronous PSO, the recall value
of asynchronous PSO is higher than that of synchronous
PSO. This is attributed to more exploration of the data.
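The three measures can be sketched as below, assuming that a system summary and the manual summary are compared as sets of sentence identifiers. The text does not state the unit of comparison, so this is an assumption, and the function name is hypothetical.

```python
def precision_recall_f(system_sentences, reference_sentences):
    """Return (precision, recall, F-measure) for two collections of sentence identifiers."""
    system, reference = set(system_sentences), set(reference_sentences)
    overlap = len(system & reference)  # sentences present in both summaries
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

Precision is the fraction of selected sentences that also appear in the manual summary, and recall is the fraction of manual-summary sentences that were selected.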


Table 1: Values of Synchronous PSO (S-PSO) and
Asynchronous PSO (A-PSO)



Fig. 3: Comparison of S-PSO and A-PSO for Nature 
dataset 



[Figure: bar chart titled "metadata"; vertical axis from 0 to 70; series: Synchronous PSO, Asynchronous PSO]

Fig. 4: Comparison of S-PSO and A-PSO for Metadata
dataset



Fig. 5: Comparison of S-PSO and A-PSO for Forest 
dataset 
Fig. 6: Comparison of S-PSO and A-PSO for Reservation
dataset

Fig. 7: Comparison of S-PSO and A-PSO for Computer
dataset

Fig. 8: Comparison of S-PSO and A-PSO for Economy
dataset



Fig. 9: values of synchronous PSO(S-PSO) and 
asynchronous PSO(A-PSO) 
VI. CONCLUSION

Automatic summarization is a complex task in which the
choice of technique affects the quality of the summaries
produced. Synchronous PSO and asynchronous PSO
summarization techniques were compared using
text documents from different domains as
inputs. In synchronous PSO, the velocities and positions
of the particles are updated after the performance of the
entire swarm has been calculated. This update method
improves performance by exploiting complete information.
In A-PSO, the velocity and position of a particle are
updated as soon as its own performance has been
calculated. The particles are therefore updated using
partial information, which leads to more exploration. The
analysis of the results shows that the asynchronous
approach produces better results than the synchronous
approach. The work can be further enhanced by using a
hybrid approach which combines S-PSO and A-PSO.